The advantage is that we can estimate the correlations between the outcomes, as well as among the Level 2 coefficients.
Mixed Effects Location Scale Models (MELSM)
Estimate both the mean (location) and variance (scale) of the repeated measures for each of your clusters.
We’ll use data provided by Williams and colleagues (2021).
d <- read_csv("https://raw.githubusercontent.com/sjweston/uobayes/refs/heads/main/files/data/external_data/williams.csv")

# scaled time variable
d <- d %>% 
  mutate(day01 = (day - 2) / max((day - 2)))

distinct(d, id) %>% count()
# A tibble: 1 × 1
      n
  <int>
1   193
d %>% 
  count(id) %>% 
  summarise(median = median(n),
            min = min(n),
            max = max(n))
# A tibble: 1 × 3
  median   min   max
   <int> <int> <int>
1     74     8    99
d %>% 
  count(id) %>% 
  ggplot(aes(x = n)) +
  geom_histogram(fill = "#1c5253", color = "white") +
  scale_x_continuous("number of days", limits = c(0, NA))
These models can help us address many questions. First, let’s examine the between- and within-person variance of these outcomes.
post = as_draws_df(m3)

post %>% 
  select(between = cor_id__NAstd_Intercept__NAstd_NA_lag, 
         within = rescor__NAstd__PAstd) %>% 
  pivot_longer(everything()) %>% 
  group_by(name) %>% 
  mean_qi()
# A tibble: 2 × 7
  name      value  .lower   .upper .width .point .interval
  <chr>     <dbl>   <dbl>    <dbl>  <dbl> <chr>  <chr>    
1 between  0.0946 -0.103   0.291     0.95 mean   qi       
2 within  -0.0249 -0.0423 -0.00733   0.95 mean   qi       
To clarify: this is the within-person correlation after accounting for the predictors in the model. This can help us detect any additional patterns in the data.
m3 %>% 
  gather_draws(r_id__NAstd[id, term], r_id__PAstd[id, term]) %>% 
  mutate(.variable = str_remove(.variable, "r_id__")) %>% 
  pivot_wider(names_from = .variable, values_from = .value) %>% 
  # just for presenting to class
  arrange(.draw, id, term) %>% 
  select(-.chain, -.iteration) %>% 
  mutate(across(c(NAstd, PAstd), \(x) round(x, 2)))
We may even want to see whether the effect of lagged positive affect on PA is correlated with its effect on NA.
postm2 = m3 %>% 
  spread_draws(r_id__PAstd[id, term], r_id__NAstd[id, term]) 

postm2 %>% 
  filter(term == "PA_lag") %>% 
  mean_qi() %>% 
  ggplot(aes(x = r_id__PAstd, y = r_id__NAstd)) +
  geom_point() +
  labs(x = "Effect of PA on PA", y = "Effect of PA on NA")
But of course, each of these points is a summary of a distribution! The real benefit of modeling these outcomes jointly is that we can also get a distribution of the correlations between these effects.
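A summary like the table that follows can be produced with something like this sketch, assuming `m3` is the bivariate model fit earlier and the tidyverse/tidybayes helpers used above are loaded: pull every `cor_id__` parameter from the posterior and summarize each with `mean_qi()`.

```r
# assumes m3 (the bivariate brms model) is in the workspace
as_draws_df(m3) %>% 
  select(starts_with("cor_id__")) %>%               # all random-effect correlations
  rename_with(\(x) str_remove(x, "cor_id__")) %>%   # shorten the parameter names
  pivot_longer(everything()) %>% 
  group_by(name) %>% 
  mean_qi() %>% 
  arrange(desc(abs(value)))                          # largest correlations first
```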
# A tibble: 15 × 7
name value .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 PAstd_NA_lag__PAstd_PA_lag 0.318 -0.132 0.681 0.95 mean qi
2 NAstd_Intercept__NAstd_PA_l… 0.303 -0.00369 0.601 0.95 mean qi
3 NAstd_PA_lag__PAstd_NA_lag 0.261 -0.240 0.692 0.95 mean qi
4 PAstd_Intercept__PAstd_NA_l… -0.241 -0.588 0.123 0.95 mean qi
5 NAstd_Intercept__PAstd_PA_l… -0.183 -0.391 0.0356 0.95 mean qi
6 NAstd_Intercept__PAstd_Inte… -0.133 -0.284 0.0217 0.95 mean qi
7 NAstd_PA_lag__PAstd_PA_lag -0.123 -0.479 0.242 0.95 mean qi
8 NAstd_PA_lag__PAstd_Interce… -0.121 -0.419 0.180 0.95 mean qi
9 NAstd_NA_lag__PAstd_PA_lag 0.102 -0.190 0.393 0.95 mean qi
10 PAstd_Intercept__PAstd_PA_l… 0.0950 -0.113 0.296 0.95 mean qi
11 NAstd_Intercept__NAstd_NA_l… 0.0946 -0.103 0.291 0.95 mean qi
12 NAstd_NA_lag__NAstd_PA_lag 0.0685 -0.300 0.405 0.95 mean qi
13 NAstd_Intercept__PAstd_NA_l… -0.0645 -0.393 0.283 0.95 mean qi
14 NAstd_NA_lag__PAstd_NA_lag -0.0334 -0.433 0.383 0.95 mean qi
15 NAstd_NA_lag__PAstd_Interce… -0.00442 -0.223 0.220 0.95 mean qi
modeling variability
Traditionally, psycho-social research has focused on identifying trait-like behaviors - for example, how negative a person is on average. But more recent work has broadened to include states - that is, whether and how much individuals fluctuate in their thoughts, feelings, and behaviors over time.
On the macro time scale, we typically assume stable traits, but on the micro time scale (day to day), we often observe substantial variability in these same traits. This presents a methodological challenge: how do we simultaneously model both the stable personality traits (the “location”) and the fluctuations (the “scale”)? The standard approach has been a two-stage process:
compute individual means (iMs) in a mixed effects model
obtain individual standard deviations (iSDs) from the residuals
But this approach has several drawbacks:
It can result in unreliable estimates
It’s particularly sensitive to the number of measurement occasions
It assumes independence of means and variances, which seems unlikely
It results in biased variance estimates
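To make the two-stage procedure concrete, here is a minimal sketch on simulated (hypothetical) data, using `lme4` for the frequentist mixed effects model in stage 1; all values and names are made up for illustration.

```r
library(dplyr)
library(lme4)

set.seed(14)
# simulated daily affect for 50 people, 20 days each (hypothetical data)
sim <- tibble(
  id  = rep(1:50, each = 20),
  mu  = rep(rnorm(50, 0, 1), each = 20),          # true individual means
  sd_ = rep(exp(rnorm(50, -0.5, 0.4)), each = 20) # true individual SDs
) %>% 
  mutate(y = rnorm(n(), mu, sd_))

# stage 1: individual means (iMs) from a mixed effects model
fit <- lmer(y ~ 1 + (1 | id), data = sim)
iM  <- coef(fit)$id[, "(Intercept)"]

# stage 2: individual standard deviations (iSDs) from the residuals
iSD <- sim %>% 
  mutate(resid = residuals(fit)) %>% 
  group_by(id) %>% 
  summarise(iSD = sd(resid)) %>% 
  pull(iSD)

# the two stages treat iM and iSD as fixed, independent quantities --
# any dependence between them is estimated only after the fact
cor(iM, iSD)
```

Note that each `iSD` is computed from only 20 residuals, which is why these estimates are unreliable and sensitive to the number of occasions.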
mixed-effects location scale models
In a standard linear mixed effects model with repeated measurements, we have:
\[
y_i = X_i\beta + Z_ib_i + \epsilon_i
\]
Where:
\(y_i\) is the observations for the ith person
\(X_i\beta\) represents the fixed effects
\(Z_ib_i\) represents the random effects (person-specific deviations)
\(\epsilon_i\) represents the error term (or states)
The key limitation of this standard model is that it assumes the same error variance \((\sigma^2)\) for everyone. The MELSM, instead, allows \(\sigma^2\) to differ at the individual level and even at different time points. The scale component is modeled on the log scale, mirroring the location equation:
\[
\text{log}(\sigma_i) = W_i\eta + V_it_i
\]
Where \(W_i\eta\) represents the fixed effects on the scale and \(V_it_i\) the person-specific random effects on the scale. The exponential function (the inverse of the log link) ensures positive variance estimates.
m4 <- brm(
  data = d,
  family = gaussian,
  bf(# location
     PA.std ~ 1 + (1 | c | id), 
     # scale (the c in |c| allows for correlation between location and scale)
     sigma ~ 1 + (1 | c | id)
  ), 
  prior = c(prior(normal(0, 0.2), class = Intercept),
            prior(exponential(1), class = sd),
            prior(normal(0, 1), class = Intercept, dpar = sigma),
            prior(exponential(1), class = sd, dpar = sigma),
            prior(lkj(2), class = cor)),
  iter = 5000, warmup = 1000, chains = 4, cores = 4,
  seed = 14,
  file = here("files/models/m82.4")
)
m4
 Family: gaussian 
  Links: mu = identity; sigma = log 
Formula: PA.std ~ 1 + (1 | c | id) 
         sigma ~ 1 + (1 | c | id) 
   Data: d (Number of observations: 13033) 
  Draws: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
         total post-warmup draws = 16000

Multilevel Hyperparameters:
~id (Number of levels: 193) 
                               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)                      0.81      0.04     0.73     0.90 1.00      400      789
sd(sigma_Intercept)                0.43      0.02     0.39     0.48 1.00     1517     3427
cor(Intercept,sigma_Intercept)    -0.34      0.07    -0.47    -0.21 1.00      962     1969

Regression Coefficients:
                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept           0.01      0.06    -0.10     0.12 1.03      215      408
sigma_Intercept    -0.64      0.03    -0.70    -0.57 1.00      666     1753

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Our posterior distribution contains both locations (means) and scales \(\text{log}(\sigma)\) for each of our clusters (participants). We can plot these to see whether they’re related.
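One way to do this (a sketch, assuming `m4` from above and the tidybayes helpers used earlier): summarize each person's location and scale random effects, then plot them against each other. In brms naming, the location effects are `r_id` and the scale effects are `r_id__sigma`.

```r
# assumes m4 (the MELSM fit above) is in the workspace
m4 %>% 
  spread_draws(r_id[id, term], r_id__sigma[id, term]) %>% 
  mean_qi() %>%                      # posterior mean per person
  ggplot(aes(x = r_id, y = r_id__sigma)) +
  geom_point() +
  labs(x = "location (person-specific mean PA)", 
       y = "scale (person-specific log residual SD)")
```

Given the negative `cor(Intercept,sigma_Intercept)` estimate above, we'd expect a downward-sloping cloud: people with higher average PA tend to fluctuate less.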
The real power of these models comes when we incorporate predictors for both location and scale.
m5 <- brm(
  data = d,
  family = gaussian,
  bf(# location
     PA.std ~ 1 + day + PA_lag + NA_lag + steps.pm*steps.pmd + (1 + day + steps.pmd | c | id), 
     sigma ~ 1 + PA_lag + NA_lag + steps.pm*steps.pmd + (1 + steps.pmd | c | id)
  ), 
  prior = c(prior(normal(0, 0.2), class = Intercept),
            prior(normal(0, 1), class = b),
            prior(exponential(1), class = sd),
            prior(normal(0, 1), class = Intercept, dpar = sigma),
            prior(exponential(1), class = sd, dpar = sigma),
            prior(lkj(2), class = cor)),
  iter = 5000, warmup = 1000, chains = 4, cores = 4,
  seed = 14,
  file = here("files/models/m82.5")
)
Our model clearly has not mixed – be sure to run this model for more iterations if you actually want to use it. As it was, this model took 80 minutes to run on my computer, so I didn’t have time to go back and run it for longer.